dpp runs the knesset data pipelines periodically on our server.
This notebook shows how to run the pipelines that render a single committee meeting page. These pipelines are part of the committees/dist pipelines, which generate the static website at https://oknesset.org
In [1]:
import sys
sys.stdout = sys.__stdout__
In [2]:
import os
if os.getcwd() != '/pipelines':
    os.chdir('..')
PIPELINES_ROOT_DIR = os.getcwd()
print(PIPELINES_ROOT_DIR, file=sys.stderr)
In [3]:
!{'KNESSET_LOAD_FROM_URL=1 dpp run --concurrency 4 '\
'./committees/kns_committee,'\
'./people/committee-meeting-attendees,'\
'./members/mk_individual'}
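Outside a notebook, the same command can be assembled in plain Python. The following is a minimal sketch, not part of the pipelines code: `build_dpp_run` and `run_dpp` are hypothetical helper names, and the only `dpp` options used are the ones that appear in the cell above (`run`, `--concurrency`, a comma-separated list of pipeline ids, and the `KNESSET_LOAD_FROM_URL` environment variable).

```python
import os
import subprocess


def build_dpp_run(pipeline_ids, concurrency=1):
    """Return the argv list for one `dpp run` call (hypothetical helper).

    The pipeline ids are joined with commas, as in the notebook cell above.
    """
    return ['dpp', 'run', '--concurrency', str(concurrency), ','.join(pipeline_ids)]


def run_dpp(pipeline_ids, concurrency=1, load_from_url=True):
    """Run dpp with the environment variable used in the notebook cell."""
    env = dict(os.environ)  # copy, so the notebook's own environment is untouched
    if load_from_url:
        env['KNESSET_LOAD_FROM_URL'] = '1'
    # check=True raises CalledProcessError if dpp exits with a non-zero status
    return subprocess.run(build_dpp_run(pipeline_ids, concurrency), env=env, check=True)


PIPELINE_IDS = [
    './committees/kns_committee',
    './people/committee-meeting-attendees',
    './members/mk_individual',
]

# Only print the command here; call run_dpp(PIPELINE_IDS, concurrency=4) to execute it.
print(' '.join(build_dpp_run(PIPELINE_IDS, concurrency=4)))
```

Passing the arguments as a list avoids shell quoting issues, and `check=True` makes a failed pipeline run stop the script instead of passing silently.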
The build pipeline aggregates the relevant data and supports filtering, which allows for quicker development cycles.
You can uncomment and modify the filter step in committees/dist/knesset.source-spec.yaml under the build
pipeline to change the filter.
The build pipeline can take a few minutes to process for the first time.
In [ ]:
!{'dpp run --verbose ./committees/dist/build'}
In [ ]:
from datapackage_pipelines_knesset.committees.dist.meeting_context import get_meeting_context_data
meeting_context_data = get_meeting_context_data()
Jinja is used to render the HTML templates.
In [ ]:
from datapackage_pipelines_knesset.committees.dist.template_functions import get_jinja_env
jinja_env = get_jinja_env('committees/dist/templates')
In [ ]:
from datapackage import Package
import yaml, sys
build_meetings_package = Package('data/committees/dist/build_meetings/datapackage.json')
meeting_rows_generator = build_meetings_package.get_resource('kns_committeesession').iter(keyed=True)
meeting_rows_generator = (r for r in meeting_rows_generator if r['KnessetNum'] == 20)
# keep only meetings attended by more than 2 MKs
meeting_rows_generator = (r for r in meeting_rows_generator if len(r['attended_mk_individual_ids']) > 2)
meeting_row = next(meeting_rows_generator)
print(yaml.dump(meeting_row, allow_unicode=True, default_flow_style=False), file=sys.stderr)
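The cell above chains generator expressions, so rows are filtered lazily and only as many rows are read as `next` needs. The sketch below illustrates the same pattern on a few in-memory rows. The sample rows and the `filter_meetings` helper are hypothetical; only the two field names come from the cell above. It also passes a default to `next`, which is one way to avoid a `StopIteration` error when the filter matches nothing.

```python
from itertools import islice

# Hypothetical sample rows; real rows come from the build_meetings datapackage.
sample_rows = [
    {'KnessetNum': 19, 'attended_mk_individual_ids': [1, 2, 3]},
    {'KnessetNum': 20, 'attended_mk_individual_ids': [4]},
    {'KnessetNum': 20, 'attended_mk_individual_ids': [5, 6, 7]},
    {'KnessetNum': 20, 'attended_mk_individual_ids': [8, 9, 10, 11]},
]


def filter_meetings(rows, knesset_num=20, more_than=2):
    """Lazily yield rows of one Knesset with more than `more_than` attending MKs."""
    rows = (r for r in rows if r['KnessetNum'] == knesset_num)
    rows = (r for r in rows if len(r['attended_mk_individual_ids']) > more_than)
    return rows


# First matching row, or None when nothing matches.
first_match = next(filter_meetings(iter(sample_rows)), None)
no_match = next(filter_meetings(iter(sample_rows), knesset_num=99), None)

# Take a small batch of matching rows without consuming the whole source.
first_two = list(islice(filter_meetings(iter(sample_rows)), 2))

print(first_match, no_match, len(first_two))
```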
In [ ]:
import os, sys
from datapackage_pipelines_knesset.committees.dist.template_functions import build_template
from datapackage_pipelines_knesset.committees.dist.committees_common import get_meeting_path
from datapackage_pipelines_knesset.committees.dist import meeting_context
# reload the meeting_context module in case you made some changes to it.
# Allows for quicker iterations while keeping all the source data in RAM
from importlib import reload
reload(meeting_context)
meeting_html_file_base_path = get_meeting_path(meeting_row)
build_template(jinja_env,
               "committeemeeting_detail.html",
               meeting_context.get_meeting_context(meeting_row, meeting_context_data, use_data=False),
               meeting_html_file_base_path,
               output_root_dir='data/committees/dist/dist')
In [ ]:
!{'dpp run ./committees/dist/copy_static_files'}
To view the output, run the following from the host PC (outside of the Docker container):
Change to the relevant directory on the host PC:
cd /opt/knesset-data-pipelines/data/committees/dist/dist
Run a simple HTTP server using Python:
python3 -m http.server
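As an alternative to the command line, the same static file server can be started from Python with the standard library. This is a minimal sketch under a few assumptions: `make_server` and `serve_in_background` are hypothetical helper names, Python 3.7 or later is available, and the directory passed in is the dist output directory shown above.

```python
import functools
import http.server
import threading


def make_server(directory, port=8000):
    """Create an HTTP server that serves static files from `directory`."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory)
    # Bind to localhost only, so the files are not exposed on the network.
    return http.server.ThreadingHTTPServer(('127.0.0.1', port), handler)


def serve_in_background(directory, port=8000):
    """Start the server in a daemon thread and return it.

    Call .shutdown() and .server_close() on the returned server to stop it.
    """
    server = make_server(directory, port)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server


# Example (not executed here):
# server = serve_in_background('/opt/knesset-data-pipelines/data/committees/dist/dist')
# ... open http://localhost:8000/ in a browser ...
# server.shutdown(); server.server_close()
```

Running the server in a background thread keeps the Python session free for other work. If this is run inside the Docker container rather than on the host, the port would also need to be published to the host for the browser to reach it.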
The following script prints the relevant URLs for viewing the output:
In [5]:
print('\n-- rendered meeting - localhost url --\n', file=sys.stderr)
print(f'http://localhost:8000/{meeting_html_file_base_path}', file=sys.stderr)
print('\n-- rendered meeting - production url --\n', file=sys.stderr)
print(f'https://oknesset.org/{meeting_html_file_base_path}', file=sys.stderr)